Comparing optimization criteria in antibiotic allocation protocols

Clinicians prescribing antibiotics in a hospital context follow one of several possible ‘treatment protocols’—heuristic rules designed to balance the immediate needs of patients against the long-term threat posed by the evolution of antibiotic resistance and multi-resistant bacteria. Several criteria have been proposed for assessing these protocols; unfortunately, these criteria frequently conflict with one another, each providing a different recommendation as to which treatment protocol is best. Here, we review and compare these optimization criteria. We are able to demonstrate that criteria focused primarily on slowing evolution of resistance are directly antagonistic to patient health both in the short and long term. We provide a new optimization criteria of our own, intended to more meaningfully balance the needs of the future and present. Asymptotic methods allow us to evaluate this criteria and provide insights not readily available through the numerical methods used previously in the literature. When cycling antibiotics, we find an antibiotic switching time which proves close to optimal across a wide range of modelling assumptions.


Introduction
Throughout the twentieth century, a small number of discoveries have fundamentally reshaped our world. Transistors have created a world of computation and communication [1][2][3], the Haber-Bosch process for nitrogen fixation has massively increased our ability to produce crops [4,5], and the development of antibiotics [6,7] has not only redefined our battle against disease but also acted as a key enabling technology in surgery and intensive care [8][9][10]. Antibiotics have helped lift life expectancy from 47 at the beginning of the twentieth century, to the 79-year average we see today [11].
At the same time, evolution, that same process that gave birth to penicillin, ensures that antibiotics are a limited resource; every time a given antibiotic is used to save a life or prevent an infection we inevitably select those bacteria most able to resist for survival. Over weeks and years this sustained selective pressure has given rise to a variety of multiresistant 'superbugs' [12]. Evolution of multiresistance is further accelerated by horizontal gene transfer (HGT) [13,14], a process by which one bacterial lineage may share genetic material with another. Bacteria have, over time, developed many forms of antibiotic resistance (ABR), first to penicillin, and then to erythromycin, methicillin, vancomycin and carbapenems [15,16].
While a large number of antibiotics have been developed over the past century, the vast majority act through only a small number of essential mechanisms; as an example amoxicillin, cephalexin, doripenem, meropenem, aztreonam, ceftolozone and many others all act through the same β-lactam group as penicillin [17]. Bacteria that are resistant to one of these compounds will quickly develop enhanced resistance to the others [18]. In the past 30 years, only a few truly novel antibiotic compounds have been developed (for example, complestatin and corbomycin, discovered 2020, [19]), and of those compounds discovered, even fewer have made it to market (see table II of Coates et al. [20] for a list of antibiotics discovered in the past decades, and where they are in the process of clinical trials).
These practical difficulties are further exacerbated by the fact that antibiotic research is unprofitable [21,22]; development of new antiobiotics is hugely costly, yet any new drug is liable to be kept on hospital shelves as a 'reserve' antibiotic, and if it is used, will cure a patient with days or weeks, as opposed to the years or lifetime for drugs designed to treat chronic illnesses, such as cancer, depression or high blood pressure.
Given the substantial human and economic costs caused by resistant and multiresistant bacteria in clinical settings [23], currently including billions of dollars, and tens of thousands of lives, there is substantial interest in understanding how best to slow the evolution and spread of multiresistant genotypes. While there is substantial interest in developing 'evolution proof drugs', targeting bacterial virulence [24], using chemical mechanisms to inhibit resistance mechanisms directly [25], or pairing antibiotics where resistance to one drug implies sensitivity to another [26], for the time being, our primary tool in minimizing ABR is simple antibiotic stewardship: selecting the time and place of antibiotic use so as to minimize selection for resistance.
Antibiotic stewardship has been studied from both a clinical point of view (see the review articles by Dik et al. [27] and Chatzopoulou et al. [28] as well as [29][30][31]) and from a theoretical standpoint (see the literature described below).
The archetypical question asked in this context is 'when and how should clinicians prescribe antibiotics?' Is a hospital better off prescribing a variety of different antibiotics or applying a monoculture, switching the drug of choice every few weeks? Should all patients be prescribed multiple antibiotics (combination therapy) so as to ensure recovery, or is a lighter touch better in the long run? And how, precisely, do we define 'better' anyway? Is it more important to ensure optimal patient outcomes in the present, or slow the development of multiresistant bacteria over an evolutionary timescale?
A first inquiry into these conundrums was made in a simulation study by Bonhoeffer et al. [32]. Subsequent models have built on Bonhoeffer's approach, exploring horizontal gene transfer [33,34], extensions to a larger number of antibiotic variants [35,36] and the effects of stochasticity [37]. Ibargüen-Mondragón et al. use control theory to consider both the effects of antibiotic treatment and immune response [38], while Houy et al. consider the question from the point of view of 'what information do decision makers have access to?' [39]. McCloud et al. explore the importance of linkage disequilibrium, the tendency of two resistance genes to be found together or apart, and model evolution on the scale of multiple inter-connected populations (for the sake of our discussion, one might imagine patients in multiple different hospitals) [40].
Concerningly, among these (and many other) investigations, no consensus on optimal treatment protocols has been reached. Uecker & Bonhoeffer [41] explain these contradictions in their clear, concise and comprehensive review article, in which they provide an overview of the subtle modeling decisions and mismatched optimization criteria that play into various contradictory rankings of antibiotic deployment strategy.
The focus of this article will be following up on a number of questions raised in Uecker & Bonhoeffer's review. We compare the strengths and weaknesses of the various proposed optimization criteria in the literature. We propose a new, more physically justifiable optimization criteria, incorporating both patient health and evolutionary risk, a middle path between the various criteria considered previously. In addition, we find a number of analytic results that will hopefully reduce the royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 need for time-consuming numeric exploration of parameter space, and give somewhat clearer insight into how parameter values combine to determine patient outcomes. For the important case of cyclic treatment protocols, we use our asymptotic results to identify a 'saturation time' t sat which gives 'almost optimal' cycle time across a wide range of assumptions. We observe that optimization criteria focused on delaying multiresistance often harm patient health in both the short and long term: they are often maximized by selecting treatment protocols with patient outcomes worse than the long-term effects of antibioitic resistance.
Our goal in what follows is not to question the orthodoxy of aggresive antibiotic treatment [42]. Instead our sole goal is to examine the methods by which we evaluate antibiotic management protocols, and explain the root cause of a number of conflicting and contradictory results previously published.
In §2, we introduce the basic model and current antibiotic protocols as described in the literature. Section 2.2 introduces four possible optimization criteria, both novel and historic. In §3, we examine the mean number of uninfected patients for a variety of antibiotic protocols and gather several novel analytic approximations for patient health. In §4, we explore the behaviour of four different optimality criteria, and how 'optimal' results change (or don't) depending on which criteria is used and how multiresistance arises. We demonstrate how a number of optimization criteria are directly antagonistic to one another, and indeed, to patient health. These results are summarized in §5. We find combination therapy to be appropriate over a wide range of contexts, and also identify t sat , a cycle time that is 'close to optimal' over a wide range of modelling assumptions. It is our hope that by identifying why different papers have reached such contradictory conclusions, we can guide medical practitioners and future modellers towards better criteria, or at the very least, give them the tools to evaluate which criteria are most relevant.

Model
Let us begin by introducing a small amount of biology and defining the two models of in-hospital ABR spread that we will be building upon. Primarily, we make use of Bonhoeffer et al.'s original model from 1997 [32]. This model has acted as the basis of many of the papers that came after, and hence acts as an excellent testing space for comparing and understanding the various optimization criteria found in the literature. The second model we considered was proposed more recently by Uecker & Bonhoeffer [43] and is designed to overcome one particular simplification in the original model, which will be relevant in our future discussions. This simplifying assumption, and when it is and is not appropriate, will be discussed later.
These models are selected primarily for their analytic tractability, clarity, and foundational position in the literature. Our goal in this paper is to compare differing optimization criteria and resolve conflicting recommendations as cleanly as possible, Bonhoeffer's model is perfect for this. With that said, future researchers may find that more detailed models incorporating more recent clinical understanding of effects, such as horizontal gene transfer [44,45] and bystander selection [46], may be more appropriate when making treatment recommendations. With that said, let us start by describing Bonhoeffer et al.'s 1997 model [32].
Imagine a hospital or hospital ward containing a number of patients. There exists some bacterial infection (for example, Streptococcus pneumoniae or Staphylococcus aureus) that spreads through the patient population and can be treated using one of two frontline antibiotic drugs (such as linezolid or telavancin). For the sake of generality, we refer to these drugs as A and B. Bacterial infections will either be susceptible to treatment, resistant to one or other of the available antibiotics, or doubly resistant. Denote the population of patients infected with each of these classes of infection as S, R A , R B and R AB , respectively. The hospital also includes a population of uninfected patients, X, who are nonetheless 'exposed' to infection. These patients represent patients recovering from surgery (or similar medical conditions unrelated to infection). While infection may provide an additional load, uninfected status does not correspond to a clean bill of health, and conversely, an infected patient may still be healthy enough for discharge if they have recovered from surgery, chemotherapy, or whatever was their primary cause for entering the hospital. It is also important to note that here 'susceptible' and 'exposed' are not used in the same manner as standard epidemiological SEIR models. Here 'susceptible' refers to the infecting bacteria's susceptibility to antibiotic treatment, and 'exposed' refers to a patient's proximity to possible infection, as opposed to the presence of bacterial infection that has not progressed to the 'infectious' stage, as implied in SEIR models.
With these five classes of patients, we now construct a compartment type model [47]. Patients switch from one infected status to another via a variety of processes (figure 1). Patients arrive at the hospital at royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 some rate m X , m S , m A , m B and m AB ; immigration rate for each class will depend on the prevalence of ABR in the community. Patients in all compartments are discharged at some rate μ; this rate is assumed to be equal across all compartments. In appendix A, we relax this simplifying assumption and consider a model that explicitly differentiates between discharge due to death or recovery. The exact details of discharge have minimal impact on all major results, and hence we assume discharge rate μ is independent of infection status, as is traditional in the literature.
Exposed individuals become infected according to mass action kinetics at rates β S S X, β A R A X, β B R B X and β AB R AB X. Differences in infection rate β represent the metabolic cost a bacteria must pay in order to maintain resistance; more resistance comes at a higher cost, hence β S ≥ β A , β B ≥ β AB . All β are assumed to be fairly similar, differing by only 1 or 2%, as opposed to an order of magnitude. Hospital policies that involve isolating ABR infected patients may be used to artificially reduce β AB further [48]. While these strategies can be very effective, they will not be our focus here.
Infected individuals recover naturally at some rate γ, and recover at some higher rate τ + γ if administered suitable antibiotic treatment (hence, administering drug A leads to recovery rate τ + γ in the S and R B population and recovery rate γ in the R A and R AB population.) We model which antibiotics are currently being administered via the indicator functions χ A (t) and χ B (t). These functions take the value zero or one, depending on whether or not the antibiotic in question is being administered at time t.
Single resistance strains are assumed to be prevalent enough in the community that de novo evolution of single resistance can be treated as negligible. The validity of this assumption will depend on the population under study.
Finally, multiresistance can be introduced to a hospital either via de novo mutations, horizontal gene transfer or importation from the wider community. Each of these mechanisms leads to different predictions and recommendations, which will be discussed in more detail on §4. For the time being, we consider two separate cases: the behaviour of the system either before multiresistance (R AB (t) = 0 = m AB ), and the behaviour of the system after multiresistance(R AB (t) > 0). The transition from R AB = 0 to R AB > 0 is discussed in §4.
Taken together, these processes lead to the following five compartment model: Figure 1. Schematic diagram of the compartment model, as defined in equations (2.1a)-(2.1e). The diagram shows the five model compartments and the flow of individuals between them. Mutation events are marked with dashed lines and are treated as being rare stochastic events. More common infection, recovery, immigration and emigration events are indicated by continuous lines; differing line thickness is used to indicate that some events are more common than others (for example, β S > β AB , τ + γ > γ). Line thickness is not to scale. Depending on the context, the R AB compartment of the above model will, or will not, be included in the analysis.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 The above system of equations assumes that the same antibiotic regime is applied to all patients: χ A (t), χ B (t) take the values zero or one. These assumptions are inappropriate however when administering different drugs to different patients. The most obvious way to represent intermediate prescription values in the above model would be to select χ A (t) = 0.5 = χ B (t); unfortunately, this would correspond to assigning all infectious patients both drugs half the time (for example drug A in the morning, and drug B at night). In clinical practice, this does not happen, and instead 'mixed' drug regimes refer to the practice of assigning half of the patients drug A across the entire course of their treatment, and the remaining patients drug B.
In order to properly model this, we use the seven compartment model of Uecker et al. [43]). In this model, we track both the resistance status of infections and the prescription status of the corresponding patients, splitting compartment . Because all prescriptions are assumed to be equally effective against susceptible bacteria, there is no need to split the S compartment, similarly with doubly resistant infections. χ A (t) and its complement χ B (t) no longer represent the probabilities of receiving a particular drug in the present, but instead the probabilities of being referred to a particular treatment group. The governing equations for R B A and R A A are with similar equations governing R A B and R B B . _ S and _ X are as defined in equations (2.1a) and (2.1e). We also consider an antibiotic switching rate q; this corresponds to the rate at which ineffective antibiotics are replaced by effective antibiotics in the case of single resistance. In the limit q = 0, patients are kept on their initial prescription indefinitely no matter what. In the limit q → ∞, patients are shifted to optimal treatment immediately. Parameter q can be thought of as a proxy for the intensity of testing for ABR within a given hospital system. A schematic of this behaviour is given in figure 2.
In what follows, Uecker's 7-box model is used when 0 < χ < 1 and Bonhoeffer's 5-box model is used when χ ∈ {0, 1}. See table 1 for a list of parameters, their descriptions and (where appropriate) default values.

Common antibiotic management protocols
In this article, we will compare three main antibiotic management protocols frequently discussed in the literature: combination therapy, mixing and cycling (mixing and combination therapy are most common in clinical practice). These protocols define which antibiotic is initially prescribed to a patient when they are admitted to the hospital, or after surgery. This prescription may later be changed based on either patient recovery (or lack thereof ), or when ABR infection is detected (see, for example, the Dutch 'search and destroy' policy for ABR [49]).
The simplest of these protocols, combination therapy, prescribes both A and B to all infected patients at all times. This approach is intended both to improve patient outcomes and to prevent multiresistance from arising by eradicating single resistant strains as quickly as possible. Unfortunately, combination therapy leads to increased direct costs and potentially heavier side effects. It also increases total drug prevalence, leading to concerns that it may encourage broad spectrum antibiotic resistance [50]. This duel action of increasing total antibiotic prevalence, but rapidly quashing single resistant strains, leads to some uncertainty with respect to the net effect of combination therapy on ABR. In this study, combination therapy is represented using Bonhoeffer's 5box model, with χ A (t) = χ B (t) = 1.
Mixing protocols assume that each patient is assigned either drug A or B with some probability, usually (but not always) χ A = χ B = 0.5. While many papers [33,36,51] have studied mixing using Bonhoeffer's 5-box model (or similar), Uecker's more detailed 7-box model is more faithful to clinical reality and will be the model used here whenever mixing is discussed.
Cycling protocols treat all patients with the same drug at any point in time, and switch back and forward between two (or more [35]) treatments every T days, preferentially treating with drug A for royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 the first T days, and with drug B for the next. These time periods are generically (though not always [52]) assumed to be equal. Mathematically, cycling is represented by . Both the 5-box and 7-box model can be used to represent cycling; which is more realistic will depend on the exact implementation of cycling used in clinical practice. For the sake of analytic accessibility, we study cycling in the context of the 5-box model. Basic simulation experiments indicate that the difference between the two models is negligible, except in the case of short cycle times, where fast cycling behaves like mixing.
Other, more detailed, management protocols have been studied. Beardmore & Peña-Miller [53] make use of detailed control theory techniques in order to determine optimal aperiodic antibiotic rotation protocols, cutting through the more heuristic approaches used elsewhere in the literature. Kouyos et al.'s [37] numeric exploration expands the space of possible treatment protocols by considering 'informed switching' protocols adapted for the stochastic hospital environment. Consideration of these more complex protocols is beyond the scope of this paper, but we do recommend these past works to the interested reader.

Optimization criteria
Broadly speaking, antibiotic protocols seek to achieve two conflicting goals: to maximize patient health and minimize the rate at which resistant bacteria (especially multiresistant bacteria) arise and develop. While these goals are easy enough to understand in an intuitive sense, there have nonetheless been a number of different formulations mathematically; each with their own strengths and weaknesses. For the sake of brevity, we can not consider all possible optimization criteria ever considered in the literature; our goal here is to present a representative sample. Additional optimization criteria, such as exponential discounting (as used in, for example [54]) and Bonhoeffer's G 1/2 are discussed in appendix D and B.4, respectively.
The most straightforward evaluation criterion is simply to maximize the number of 'healthy' patients over a given time frame (most often, 1 year): XðtÞ dt: This evaluation criterion is illustrated in figure 3a. It was used in Bonhoeffer's initial exploration of antibiotic protocols [32] (albeit with a constant offset). Under reasonable assumptions, it can be shown that maximizing X is equivalent to minimizing patient fatalities and minimizing total hospitalization time (appendix A). royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181

S R A
As an evaluation criterion, X 365 runs into two difficulties: firstly, it is explicitly 'blind' to the time of multiresistance introduction. Secondly, selection of different time windows or initial conditions may lead to different results; long time horizons emphasize the eventual steady state, while shorter time horizons are informed by initial transient behaviour. It is not always clear which time window is most appropriate.
Interpretation of these parameter differs between the 5-box and 7-box model.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 If our interest is primarily in the arrival dynamics of multi-resistant bacteria, then we may instead attempt to maximize T 1/2 , the time at which half of all infections are doubly resistant (R AB ). While interesting and meaningful from an evolutionary standpoint, T 1/2 makes a poor optimization criterion; maximization of T 1/2 generically leads to withholding all antibiotic use, thus delaying the takeover of doubly resistant mutants indefinitely. This optimization criterion is illustrated in figure 3b.
One possible balance between these two conflicting goals is to instead maximize This is illustrated in figure 3c. While providing some balance between maximizing health and time, it is doubtful that X T provides the correct balance between X 365 and T 1/2 . In effect, X T assumes that we will have no healthy patients from T 1/2 onwards. It seems unlikely this was the intent of authors using this criterion, but it is the implicit result. Much like T 1/2 , X T is liable to recommend non-treatment, as integrating even small numbers over an infinite time window gives 'optimal' results. Also, as mentioned by Uecker & Bonhoeffer [41], the criterion completely ignores the effects of a particular epidemic protocol on the time after the emergence of R AB . Because X T depends on the exact time course of X(t), there is also the risk that it will give conflicting answers for systems with different initial conditions. Because the exact initial conditions for a particular hospital are not knowable in advance, it would be preferable to avoid such sensitivity. Figure 3. Comparison of different optimization criteria. Here, we consider some hypothetical hospital for which the number of uninfected patients X(t) bounces around at high levels before the introduction of a small number of multiresistant infections at time T e . At this stage, multi-resistance increases to high prevalence (T 1/2 ), while the number of uninfected patients drops, oscillating around a low equilibrium value, X AB , for the rest of time. (a) the X 365 optimization criterion attempts to maximize the total number of uninfected patients (area under the curve) up until a given time (usually t = 365.) (b) The T 1/2 optimization criterion seeks to maximize the time taken until multiresistance takes over half the population, R AB = 0.5. (c), X T aims to maximize the number of uninfected patients up until T 1/2 . (d ) X T Ã maximizes the gain in uninfected patients relative to the multiresistance equilibrium, prior to multiresistance. Rather than integrating over X(t) we instead calculate the X equilibrium both before ( X) and after ( X AB ) the introduction of multiresistance. While this criterion may seem unusual, one step removed from a direct integration of X, unlike the other criteria it does not depend on initial conditions. royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 All three of these difficulties can be avoided by instead considering the novel optimization criteria (as illustrated in figure 3d) Here, X indicates the long-term average of X prior to multiresistance, X AB is the long-term average after multiresistance occurs and T e denotes the first time that the multiresistant population reaches some low level e, potentially set so that e represents a single multiresistant infection. The use of long-term averages removes the effects of initial conditions and makes X terms more analytically accessible. In the case of cycling, these averages are taken over the course of one entire cycle; in the case of 'static' protocols, these averages are the equilibrium values of X.
X T Ã can be thought of as being 'formally equivalent' to optimization over the integral X AB dt subtracted off so as to render the optimization criteria finite. In some sense, our goal is not to maximize the total number of healthy patients over some time window (which can be manipulated via manipulation of the time window), but instead the increase in health due to the use of antibiotics, over all time.
Subtracting off X AB also serves to forbid certain degenerate strategies where X , X AB . These strategies represent protocols which give worse patient outcomes than the threat of multiresistance itself (for example, never using antibiotics), but delay multiresistance indefinitely (T e , T 1=2 ! 1). Protocols of this kind tend to give 'infinite value' according to time focused criteria such as X T and T 1/2 . By contrast, X T Ã assigns a negative score to such protocols ( X À X AB , 0). We will discuss this detail more thoroughly in §4. See figures 7 and 8.
It is important to note that no assumption is made that X and X AB are achieved using the same management protocols, hence (for example) it is possible that a cycling protocol may be used prior to multiresistance, and a mixing approach afterward. X T Ã optimizes on the assumption that we pick the best possible protocol post multiresistance introduction.
T e is chosen over T 1/2 as our stopping criterion for technical reasons. See appendix B.4 for details, as well as a comparison between X T Ã and the G 1/2 optimization criterion put forward by Bonhoeffer [32].
See figure 3 for a schematic illustration of all optimization criteria. Optimization criteria are also summarized in table 1.

Mean X values
In order to make sense of X T Ã (and other optimization criteria), it will prove helpful to calculate both X and X AB . Rather than calculate numeric integrals of X for a variety of initial conditions and parameter values (as has been done in previous papers [33,35,36]), our goal in what follows is to find asymptotic approximations for a variety of parameter regimes, making use of different simplifying assumptions in each case. See figure 4 for sample trajectories in each of these regimes.
Let us start by considering the case of combination therapy prior to the introduction of multiresistant bacteria. In this case, assuming antibiotics are effective at keeping down infection (τ + μ) ≫ β i X for each β i , all infected compartments can be well approximated by the balance between immigration and recovery and The total population of the ward is given by s ¼ P m i =m, and hence the long-term equilibrium of X is well approximated by royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 In the case of mixing ( prior to multi-resistance), we use the 7-box Uecker model (figure 2). In order to calculate the long-term equilibrium behaviour, we first calculate what fraction of cases are receiving effective treatment for each strain. Effective treatment ratios can be shown to be equal to This in turn leads to three different approximations of X at equilibrium, depending on which resistant strain is dominant The true equilibrium value for X is well approximated by the smallest of these three values. If X A mix is smallest, this indicates that R A is the dominant infection strain, the current limiting factor on improved health. If X S mix gives the smallest value (usually for high treatment correction values, q), this indicates that infection is dominated by disease importation; treatment within the hospital is close to optimal and ABR within the hospital is dominated by importation from the community. This situation might arise for example in situations where we have high levels of testing for antibiotics (such as the Dutch 'search and destroy' ABR policy [49]), or in situations where community levels of ABR are very high.
Illustrations of these results are given in figure 5. Derivation of these results can be found in appendix B.2. The next case to consider is the cycling case. We are interested in the long-term mean value of X averaged over precisely one cycle length: XðtÞ=T dt, where T is the length of a  Figure 4. General behaviour of the system when using mixing (a), combination therapy (b), or cycling (c,d). For the two 'static' strategies, (a,b), random initial conditions rapidly approach equilibrium for S and X. In the mixing case (a), R A and R B approach equilibrium slowly; for the parameters considered here, they become equal in limit as t → ∞; differences are purely based on initial conditions. For cycling (c,d ), background colour indicates the antibiotic currently in use. We consider two distinct regimes: 'fast cycling' (c), in which equilibrium is never reached, and 'slow cycling' (d ), in which R A , R B approach equilibrium values and are stabilized by importation from the community before drug switching occurs.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 single cycle (we consider t 0 → ∞ so as to ignore transient effects that might be caused by initial conditions. Alternatively, we might solve to find X(t) on the interval [0, T ] with periodic boundary conditions). For the time being, we consider the symmetric case, in which A and B have identical migration and treatment parameters (m A = m B , β A = β B ) and equal cycle time. Detailed calculations applicable to both the symmetric and asymmetric case are provided in appendix B.3. For cycling, there exist two major parameter regimes: a 'fast cycling' regime in which the equilibrium is never reached and a 'slow cycling' regime in which the system approaches its long-term equilibrium with each cycle (figure 4c,d). For sufficiently fast cycling, it can be shown that each antibiotic is used half the time, and hence, each applies at 50% effectiveness. For slow cycling, the long-term average approaches Here, C represents the transient 'spike' in X immediately following the change in treatment, while resistance to the new drug is rare ( figure 4). The effects of the spike are diluted across the length of a cycle. X represents the equilibrium value of X approached over the course of an arbitrarily long cycle, once resistance to the new drug has taken hold. Each drug has less than 50% effectiveness, because slow cycling results in resistance becoming ubiquitous in the population with each cycle; each drug spends most of its time treating the bacterial strain it is least effective against. So long as resistance importation is rare (m A , m B ≪ R A , R B ) both X and C can be calculated  Figure 5. Equilibrium values of uninfected individuals when using a mixing protocol. (a) As q (the drug correction rate) is increased, X increases approximately linearly (following the X A mix equilibrium), before approaching a maximal value at the X S mix equilibrium. (b) As drug selection probability χ A is varied, the equilibrium population X increases to a maximum and then decreases, passing from a X A mix limited regime to a X B mix limited regime. Here, we select drastically different β A , β B , so as to illustrate the effects of asymmetric infection rates. In practice, these rates can be expected to be approximately equal, and χ A = 0.5 gives close to optimal results. For both figures, the solid blue line represents numeric estimation of X at equilibrium, calculated using Newton's method. The black dashed lines are the upper bounds, as given by X royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 As previously, σ is the total population.
R and S are the equilibrium population size for susceptible and resistant strains, assuming appropriate treatment (so, R A being treated with B, or R B being treated with A).
For all T, X is best approximated by the minimum of X slow cycle , X fast cycle , see figure 6. This approximation is generally fairly tight, except in the boundary region where X slow cycle % X fast cycle . We refer to the boundary between fast and slow cycling as the 'saturation time', denoted t sat . Saturation time can be calculated by setting X slow cycle ¼ X fast cycle , and solving The final parameter regime to consider is the equilibrium in the presence of multiresistant bacteria, that is to say R AB > 0. In this case, it is easy to show that R AB must increase until This result applies regardless of which treatment protocol is employed. Strategies that would receive higher X in the absence of multiresistant infections will instead approach an equilibrium or cycle with mean value close to X AB . Protocols with worst health outcomes than the best possible response to multiresistance ( X , X AB ), such as withholding antibiotics entirely, will retain their low X value, and will result in the R AB population decaying at a rate proportional to X AB À X (see appendix B.4 for details).

Introduction of multiresistance
Analytic calculations of X before and after the introduction of the doubly resistant strain provide some measure of the relative ranking of different ABR protocols. In order to evaluate Ð Te 0 X À X AB dt however, we must also consider when double resistant bacteria are introduced, that is to say, we must estimate T e . Because X T Ã is linear in T e , it is sufficient to estimate the expected value of T e .
Doubly resistant strains are introduced to a hospital via a number of different channels, each with its own unique rate constant. For example, if a doubly resistant strain is imported from outside the hospital at some constant rate M import = m AB , then the expected introduction time of doubly resistant infection is constant, and independent of the antibiotic management protocol currently in use in the focal hospital (though potentially dependent on other hospital or inter-hospital policies [55]).
Multiresistant strains can also arise via de novo mutations. Such mutations can be modelled as either dependent on selective pressure (M select ¼ n select [34] or entirely independent of selective pressure (M base ¼ n base B R A þ n base A R B ) (one of multiple possibilities originally considered by Bonhoeffer et al. [32]). Finally, multiresistance may also occur via horizontal gene transfer between existing strains; the case was considered by Bergstrom et al. [33], where the assumed multiresistance is produced at a rate proportional to M HGT = ν HGT (R A × R B ). More detailed models of  Figure 6. Comparison of X as calculated numerically, versus the approximation X % minðððm þ g þ t=2Þ=bÞ, X þ C=TÞ. Analytic results provide an accurate approximation of X for most T values, except in a small window on either side of the 'saturation' time t sat .
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 HGT, based on within host dynamics, have been studied (for example Roberts et al.'s recent work [56]). For the time being, and for the sake of simplicity, we follow in the footsteps of Bergstrom et al. The relative importance of each of these four channels can vary from pathogen to pathogen, and will also depend critically on antibiotic stewardship policy across the surrounding region; careful stewardship may mean that M import is minimized and we are mainly interested in de novo mutations or gene transfer. In a region with high ABR prevalence M import may dominate.
While each of the above are reasonable assumptions they are all, in some sense, 'cartoon' approximations of exceptionally complex processes. Future investigation into the various sources of multiresistant infection may well suggest improvements upon the above terms, or even introduce new terms representing previously ignored channels such as horizontal gene transfer from a patient's commensule bacteria [57]. At the very least, it would be useful to determine the relative contribution of each channel in clinical practice. Such questions, however, are far beyond the remit of this simple mathematical analysis. For the time being, we treat each of the above M i as given and assume that in any given circumstance one channel of multiresistance dominates all others.
Our goal in what follows is not to make any specific or universally applicable policy recommendations, but instead to examine the types of recommendation made by various optimization  Figure 7. (a,b) The expected values of X 365 for high and low mutation/importation rate ν. Each line indicates a different channel for introduction of multiresistance. (c,d) Expected values of T 1/2 and X T . As might be expected, both optimality criteria tend to infinity as χ A nears 1 or 0, indicating that for sufficiently extreme values, R AB will never account for half the infected population. This applies regardless of the multiresistance introduction channel. (e) X T Ã for a variety of χ A values. ( f ) X T Ã rescaled such that max (X T Ã ) = 1. This rescaling makes maxima more clearly identifiable, and is as mathematically valid as any other scaling, seeing as comparing ν between different multiresistance introduction methods is inherently meaningless in the context of the current model.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 13 criteria under the influence of different M i . We are in some sense evaluating the optimization criteria themselves, rather than the management protocols on which they act. Rather than attempt a full of exploration of the entire parameter space, we focus on two straightforward test cases in order to illustrate the types of behaviour generally observed.

Comparison of optimization criteria: mixing
In what follows, we make use of Uecker's 7-box model, with parameter values β S = 1, β A = β B = 0.99, β AB = 0.98, q = 0. We are interested in the recommendations made by the four optimization criteria X 365 , T 1/2 , X T and X T Ã with respect to the mixing rate parameter χ A . For the sake of breaking symmetry slightly we set m B = 2 m A . Results in the q > 0 case are explored in appendix D. A full description of all parameter values and simulation methods is made available via github [58], and is saved in the file 'EvaluatingAllOptimalMetricsMixing.m'.
In all cases, we assume that R AB (0) = 0 and that multiresistant mutants appear according to a Poisson process with rates proportional to M import , M base , M select or M HGT . Results for each of the optimization criteria are presented in figure 7. Both X 365 and X T Ã have local maxima near χ A = 1/2 for M import , M base , M select . For M HGT , these two optimization criteria have sharp local minima near χ A = 1/2, with their maxima off centred (to differing degrees). This makes sense: if resistance is primarily formed via horizontal gene transfer, than the primary goal of any management protocol is to avoid regions of treatment space where R A and R B coexist. X 365 and X T Ã differ in how far from χ A = 1/2 one must move in order to reach optimal results. By contrast, X T and T 1/2 both recommend extreme values of χ A , tending to infinity precisely in those areas where X T Ã < 0. Once again, this matches expectation: if X , X AB then R AB cannot increase in prevalence, and hence T 1/2 = ∞.
When dealing with the optimality criteria X T Ã, X T and T 1/2 , ν acts primarily as a scaling constant (with a constant offset in the case of X T and T 1/2 , caused by the delay between T 1/2 and T e ). In the case of X 365 , royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 by contrast, the value of ν acts to determine the competition between multiresistance arriving and the year ending, leading to qualitatively different results depending on the exact value of ν ( figure 7a,b). Fortunately, these changes have at most modest effects on the optimal mixing ratio, even for X 365 . Hence, precise knowledge of ν is not crucial. The important lessons from figure 7 in terms of criteria comparison are that X T and T 1/2 are virtually indistinguishable, and can be seen to optimize in almost precisely the opposite direction to X 365 and X T Ã. X 365 and X T Ã are broadly similar in their overall behaviours, though not always in their precise recommendations.

Comparison of optimization criteria: cycling
We next consider the implications of different optimization criteria in the context of antibiotic cycling. In this case, the 5-box model is used, with parameter values β S = 1, β A = β B = 0.99, β AB = 0.98. Rather than attempt to explore the entire parameter space of possible cycling protocols, we here assume symmetric cycling times, and equal importation rates (T A = T B , m A = m B ), see appendix D for comparisons in the asymmetric case. We are interested in determining the expected value of X 365 , T 1/2 , X T and X T Ã for a variety of cycling times T. Expectation is taken over all possible introduction times of the multiresistant mutant, and also over the 'initial phase' (how far through the cycle the system is at t = 0). Initial phase is selected uniformly at random between 0 and 2T. Full code is available via github [58], and is saved in the file 'EvaluatingAllOptimal_cycling5box.m'. See figure 8 for results.
Once again, X T and X T Ã give conflicting recommendations. Much like the mixing case, X T and T 1/2 both recommend extreme parameter values: both metrics increase monotonically with cycle time T and tend to infinity as T → 3957.86; this is the smallest value for which X , X AB . By contrast, both X 365 and X T Ã are maximized for intermediate values of T. For X T Ã, optimal cycle times for all multiresistant introduction channels cluster around t sat . This makes sense-t sat denotes the largest T value that can be used without suffering reductions in X. Any M i that is minimized by increasing cycle time T can be increased up to t sat at no cost. Any increase beyond t sat inevitably comes at the cost of a reduction in X. Because the transition between X slow cycle and X fast cycle at t sat is not sharp, the exact location of the maximum of X T Ã varies slightly depending on the source of multiresistance. For M base and M import , X T Ã is maximized just below t sat , for M HGT and M select , the cost of switching antibiotic is higher, and X T Ã is maximized for T slightly larger than t sat . This clustering of optimal results around t sat is stable to variations in parameter values; see appendix C.
Qualitatively, X 365 gives results that are similar to X T Ã in some ways, but not identical. Overall, X 365 can be described as 'lumpier'; this higher complexity results from the interaction between three different timescales: the timescale of cycling, the timescale of ABR introduction, and the timescale of the integral (one year). When ν values are very small, the probability of a mutation occurring within 365 days becomes small. In this case, X 365 approaches 365 X. By contrast, when ν values are large, mutation occurs within one or two cycles. In this case, the approximation M ¼ M is no longer appropriate and the exact phase of cycling at the start of the simulation has a significant impact: for example, if using M HGT then multiresistance most commonly arises during the time of drug switching. Whether t = 0 occurs before or after a switch can have significant impact on the time until multiresistance arrival, and hence on the overall shape of X 365 . The large ν regime for X 365 draws attention to transient effects caused by initial conditions. These time-scale effects are ignored by X T Ã, which considers only longterm averages of M, and has no predefined 'end point' at 1 year.

Comparison of optimization criteria: combination therapy
The final antibiotic management protocol that we consider is combination therapy, in which both antibiotic treatment options are used simultaneously for all infections. Unlike the previous two cases, where we have a parameter value to vary over, combination therapy has no such parameter values: it is, in essence a single protocol, as opposed to a wide family of protocols. As such, there is no good way of comparing combination therapy to itself, and instead the protocol must be compared to the best results from the previous protocols. The results below are based on solving equation (3.4) and comparing to the optimal results for mixing and cycling. Full code for this work can be found on github [58] included as a special case in the file 'EvaluatingAllOptimalMetricsMixing.m'. The parameter values assumed were μ = 1/5, β S = 1, β A = β B = 0.99, β AB = 0.98, γ = 1/10, γ + τ = 1/2.5, other import/export parameters can be found in the file itself, as needed.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 Using either cycling or mixing, X T and T 1/2 can be pushed towards infinity for suitably extreme parameter values (χ → 0 or 1 in the case of mixing, T → ∞ in the case of cycling). Combination therapy allows no such 'infinite optimization' for either of these protocols, and hence, for any protocol focused on the dominance time of multiresistance, must be considered strictly worse. This is in line with results by Obolski & Hadany [34], who rank protocols based on the emergence time of multiresistance, and conclude that cycling is preferable to combination therapy.
For X 365 , with either large or small ν values, combination therapy is of the order of 2-3 times better than optimal mixing and optimal cycling for M base , M select and M HGT (cycling beats mixing for M base , M import , but is inferior for M HGT , though in most cases, differences are marginal). For M import , we find that optimal cycling is superior to combination therapy, which is superior to optimal mixing, regardless of ν.
For X T Ã however, we find that combination therapy dominates all other strategies, regardless of M. For M import , combination therapy gives X T Ã three times larger than both optimal cycling and optimal mixing. For M select , we find a approximately 190-fold increase, for M base , combination therapy is approximately 500 times better than optimal (symmetric) cycling, and approximately 700 times better than optimal mixing. In the case of M HGT , combination therapy gives X T Ã values more than 1000 times greater than optimal mixing, which is in turn more than 10 times greater than optimal symmetric cycling (this difference is reduced for asymmetric cycling, but in this case, the optimal cycling times tend to zero, indicating that mixing is the superior strategy, see appendix D for details). The large ratios involved here should be read with a degree of caution: we do not claim that combination therapy is hundreds (or thousands) of times 'better' than mixing or cycling strategies. By its nature X T Ã takes values closer to 0 than many of the other evaluation criteria, hence exaggerating the ratio between different values. X 365 may be preferable for more physically meaningful comparisons between combination therapy and other protocols. That said, for models considered here it is clear that X T Ã unequivocally favours combination therapy over all other protocols.

Discussion
Antibiotic resistance, and the proliferation of multi-resistant bacteria, pose significant challenges to modern healthcare systems, threatening to roll back the past century of antibiotic research [16]. Antibiotic management protocols are designed with the goal of improving patient outcomes while preventing (as much as possible) increases in resistance. The question of how best to represent 'good outcomes' mathematically runs into certain difficulties: individual optimization criteria often run at cross purposes and in many cases entirely contrary to one another.
Based on our explorations in §4, it seems likely that criteria intended primarily to delay the arrival of multi-resistance (T 1/2 and X T ) should be avoided in most circumstances. These criteria tend to infinity precisely in those regions where health outcomes are worse than the long-term impact of multiresistance itself. This may be appropriate in certain cases: when dealing with mild, short-term illness, antibiotic stewardship may be prioritized over immediate recovery [59]. This is not generically the case considered in this article however, where our focus has been antibiotic policy for preliminary antibiotic allocation for inpatients at a hospital ( prior to more detailed ABR testing). With this in mind, it would appear that time maximizing optimality criteria are most often actively harmful to the patient population, both in the short and long term. It is precisely these time maximizing optimality criteria that recommend against combination therapy [36]. In all other cases, when combination therapy is considered, it is found to be superior to both cycling and mixing based protocols. Whether or not this rather strong result holds up when considering additional factors such as monetary cost or bystander selection is currently unknown.
In order to balance the value of delayed multiresistance with improved health outcomes, we construct the novel optimization criterion, X T Ã ¼ Ð Tc 0 X À X AB dt; this is in some sense equivalent to optimizing Ð 1 0 XðtÞ dt, albeit with the 'constant' Ð 1 0 X AB dt subtracted off so as to avoid infinities. By definition, the criterion only gives positive values for protocols that improve patient outcomes relative to a ward dominated by multiresistant bacteria.
Asymptotic arguments allow us to calculate the mean number of uninfected patients, X, for a variety of cases, both pre-and post-multiresistance arrival ( §3).
X AB can be shown to be independent of management protocol, while X is protocol dependent. In almost all approximations of X, the infection rate β is a key parameter. While we have generally discussed differences in β as being metabolic costs of resistance, it is important to note that these costs can be imposed 'artificially' through targeted royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 isolation of infected individuals. This is suggestive of the critical importance of such clinical measures as improved hygiene practices and rapid diagnostics and isolation of ABR cases [48,49,60].
In all cases, we find that our results are sensitive to ABR importation rate m A , m B and m AB , that is to say, the prevalence of ABR in the community. When using cycling, we find that optimal cycle times for X TÃ scale with t sat , the so-called 'saturation time' above which which X rapidly decays ( figure 8). The exact position of optimal cycling relative to t sat depends on the source of multiresistant infection (horizontal gene transfer, spontaneous de novo mutation, or selective de novo mutation). Because t sat depends on the prevalence of ABR in the community, knowledge of the local community may be crucial for selecting optimal cycle times.
Recommendations for optimal mixing depend on the means of multiresistance introduction: balanced 50:50 mixing gives good results when multiresistance is either imported, or generated through de novo mutation and poor results if multiresistance arises via horizontal gene transfer ( figure 7). Further empirical work will be needed in order to determine which of these channels is most significant.
While the discovery of t sat , and its use in estimating optimal cycling times is a nice result, there are (inevitably) a number of caveats, conditions and stones still left unturned.
Firstly, it is worth reiterating that we have deliberately based our work here on a classical model [32]. This was done to allow cleaner comparison, and give readers a baseline that we hope they are familiar with. With that said, science moves on. Our modelling tools and understanding of biology have improved since Bonhoeffer's publication. The significant impact of phenomena such as bystander selection [61] are better understood, and there has been a more recent push toward integrating both within host and between host models [62]. Similarly, in the area of clinical policy, diagnostic-informed strategies such as the 'search and destroy' ABR policy implemented in the Netherlands [48,49,60] have been mentioned only briefly in this article. Such diagnostic informed policies can be hugely powerful, and were neglected here simply because it was our intention to model pre-diagnostic ABR policy (in line with the literature we wished to comment upon).
As a more mathematical caveat, many of the asymptotic results here are made on the assumption that both m A and m B are rather small; ABR spread is dominated by infection within the hospital (nonsocomal infection). Outside of this parameter range, the results presented here may be less relevant. The second mathematical limitation in the present research is the assumption of continuously varying population; given the relatively small size of a hospital (dozens to hundreds of individuals), this continuity assumption is at best suspicious, especially when many key dynamics of the system occurring when R A (t), R B (t) ≪ 1. It would be interesting to explore these results in the stochastic context. It seems likely that some analogue to both X AB and T e can be determined, though it is far from clear that X AB will be independent of ABR protocol in the stochastic case. Similarly, the addition of a third antibiotic is also predicted to change the behaviour of the system, and the appropriate definition of X AB . Despite these complications, we would still posit that some optimization criteria conceptually equivalent to X T Ã are likely to prove useful in all of these cases. While X was found analytically for all management protocols, calculation of mutant arrival rates relied on numeric solutions of ODEs, and no analytic approximation looks forthcoming. This problem is further exacerbated by the fact that it is not even clear which M function is appropriate, and the possibility of horizontal gene transfer between native commensule bacteria carrying resistance genes and invasive pathogenic bacteria [63][64][65] raises the possibility that none of the M functions explored here give reliable results. Determination of which M function should guide selection of cycle time is an empirical question rather than a mathematical one. We do note, however, that while seldom optimal, T = t sat is considered 'good' for all channels of multiresistance considered here.
Throughout the literature, numerous antibiotic deployment protocols have been proposed, each with the conflicting goals of maximizing patient health and maintaining antibiotic effectiveness. Our focus throughout this article has been to compare these protocols and (more crucially) to compare the various optimization criteria used to rank them. We find criteria that are overly focused on antibiotic stewardship (T 1/2 , X T ) tend to recommend patient outcomes which are worse than the long-term outcome of multiresistance, and hence are harmful to patients both in the short and long term. For this reason, we recommend against the use of such optimization criteria. We find that optimization criteria which are more patient-centric (X 365 , X T Ã ) generally recommend combination therapy as the best method of preventing the creation of multiresistance inside the hospital, but may occasionally recommend cycling if multiresistance is primarily introduced from the community. royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Competing interests. We declare we have no competing interests. Funding. A.J-L. acknowledges funding by the School of Medicine and HealthScience, Carl von Ossietzky Universität Oldenburg, https://uol.de/en/medicine. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. B.B. received no specific funding for this work.
Acknowledgements. We gratefully acknowledge discussion and feedback from Hildegard Uecker and Arne Traulsen, who provided valuable feedback on an early draft of this note, contributing to both clarity, context and focus.

Appendix A. Minimizing mortality and hospital stay
Along with the question of 'which integral of X best represents our ABR goals?', Uecker & Bonhoeffer also raise the question of whether or not integrals of X are the best starting point for any evaluation criteria, referencing the three competing goals of 'disease prevalence, mortality rate, length of hospitalization' ( [41], p. 13), after all, the most obvious clinical goal of a hospital is to minimize mortality, and assist in patients' speedy recovery. Maximizing the number of patients in the ward in the uninfected class is not (at face value) the same as minimizing the number of fatalities. In Bonhoeffer's original investigation [32] infection compartments represented the number of infections in the community-in this context maximizing the uninfected population is a fairly direct 'maximization of health'. However, later adaptions of the model [33] instead imagine each compartment as populations within a hospital, with immigration and discharge rates back into the community. The exposed class in these models represents patients who have no major infection, but remain in the hospital for other unrelated reasons (for example, post-operative care). In this context, it is not obvious that maximizing X should be our primary goal; instead a hospital director might want to minimize fatalities, or minimize the average number of patients (a rough proxy for the burden on the health system, and the amount of time patients spend in hospital).
In order to study alternative evaluation metrics, we extend the model in two ways; firstly, we separate emigration from the system into two streams: death and emigration. Secondly, we allow the death/ emigration rate to vary between infected and uninfected individuals. For the time being, all infections are assumed to have the same death/emigration rates, regardless of the ABR of any given compartment. This leads us to replace the exit rate μ with four parameters: d, d X , e and e X ; that is to say an infected and uninfected death rate, and an infected and uninfected emigration rate (via hospital discharge).
The total death rate at any given time is given by This total population is given by Noting that the total number of patients within the hospital is conserved, but for immigration, death and discharge, we find Integrating the above over an entire cycle (when cycling drugs) or solving for steady states (for mixing and combination therapy) gives us a relationship between the mean uninfected population and mean total population 0 ¼ X m i À ðd þ eÞ s À ðd X þ e X À d À eÞ X ðA 3Þ and s ¼ P m i ðd þ eÞ À ðd X þ e X À d À eÞ ðd þ eÞ X: ðA 4Þ Combining equations (A 1), (A 2) and (A 4) royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 For any reasonable illness, where e X > e, d X < d, the problem of minimizing death is equivalent to maximizing X (the two have a negative linear relationship). The relationship between s and X will be positive or negative depending on the sign of (d + e − d X − e X ). When d − d X < e X − e, improvements in discharge rate are larger than changes to mortality rate and minimizing hospital load is equivalent to maximizing X. When d − d X > e X − e, the 'best' strategy for minimizing hospital load is to maximize the spread of infection; however, in this case, hospital load is reduced purely through patient fatality. In such a case, maximizing X would still appear preferable according to any reasonable medical or ethical standard.
It is also worth noting that the various asymptotic results throughout this paper are determined by _ R A , _ R B and independent of σ and d X , e X , hence they will still apply even when d X ≠ d, e X ≠ e. The one exception is equation (B 12), a quadratic in S and σ; given the linear relationship between σ and X, this quadratic can be easily adapted for variable exit rates.
Hence maximizing X is an optimal strategy, both for minimizing mortality, and patient recovery time. While not exactly a revelation, it is reassuring to know that these goals are well aligned and in some sense 'equivalent' to one another; both downstream goals are not only monotone in X but also linear. These results may be more complicated in situations where death or discharge rates vary between different ABR classes, or if we are dealing with multiple different infections of varying severity, but based on what we have seen so far, all reasonable measures of success will generally point in the same direction.

Appendix B. Analytic approximations of X for various protocols
In this section, we give derivations for the various X approximations of the main text, that is to say equations (3.4), (3.7), (3.8) and (3.11). For each of these approximations, we describe the underlying assumptions, where the approximation is applicable, and how it can fail.

B.1. Combination therapy
In the case of combination therapy, all infected compartments are subject to at least one effective antibiotic. There exists a single stable equilibrium, which can be found numerically. Under the assumption that A and B are effective at keeping infection down (τ + μ) ≫ β i X for each β i , and all infected compartments can be well approximated by the balance between immigration and recovery: S ≈ m S /(τ + μ + γ), R A ≈ m A /(τ + μ + γ), R B ≈ m B /(τ + μ + γ). The total population of the ward is given by s ¼ P m i =g, and hence the healthy population is This gives equation (3.4) of the main text.

B.2. Mixing
Antibiotic mixing is best represent by the '7-box' model, as proposed by by Uecker & Bonhoeffer [43] (see §2 for details). As in the case of combination therapy, the equilibrium state for mixing can be found by solving Analytic solutions can be found by first determining what fraction of R A and R B are being treated effectively. This can be done by first noting royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 cancelling common terms (the top row) gives ðB 5Þ Let us call this ratio Þ, the effective treatment ratio. A similar ratio applies for A, Equipped with an analytic expression for how often R B is treated effectively, we can now determine for what values of X R B can be stable  ( 1), and X + S make up the vast majority of the population: X þ S % s ¼ P m i =m. Here, σ represents the total hospital population at equilibrium. For the particular model considered here, σ is independent of time and AB protocol.
The S dominated equilibrium can be found via the quadratic formula Taken together these three upper bounds (R A , R B or S dominated equilibria, equations (3.7a), (3.7b), (3.7c)) constrain the long-term equilibria X. Diagrams comparing the exact equilibrium (found numerically) to the approximations above for a variety of χ and q values are given in figure 5. As can be seen, all approximations are highly accurate within their domain of applicability. Near the boundary of these regions, where multiple constraints are approximately equal, multiple different strains of infection have non-trivial contributions, and X is correspondingly reduced, leading to a smooth transition.

B.3. Cycling
Let us now turn our attention to cycling protocols. Cycling protocols take one of two major forms: 'fast' cycling, in which the system is never allowed to reach equilibrium, and 'slow' cycling, in which the system reaches equilibrium with each cycle. Typical trajectories of such cycling protocols are given in figure 4c,d of the main text. Each dynamic regime allows for different simplifying assumptions and gives rise to different approximations of X (equations (3.8)). In the main text, we explore cycling protocols in the context of symmetric parameter values β A = β B , m A = m B and cycle times T A = T B . Here, for the sake of generality, we consider the alternative case, where these parameter values are not assumed to be equal. The symmetric case considered in the main text follows directly as a special case. In what follows, T A and T B indicate the amount of time during a cycle spent administering drug A and B, respectively (total cycle time T A + T B ). This gives royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 Let us first consider the case of 'fast' cycling. For fast cycling neither R A nor R B approach their equilibrium values and we can integrate over a full A/B drug cycle to find By periodicity log (R A (2T )) = log (R A (0)), and hence X X Similarly, integrating _ R B over a full cycle gives rise to Much like the mixing case, one of these two bounds will be smaller; the associated strain (either R A or R B ) will maintain a significant population at all times, while the other strain is pushed to low-population levels maintained only by immigration (figure 9a). Because m i ≪ R i at all times for the dominant strain, Ð TAþTB 0 ðm A =R A Þ dt % 0 and the X % X A=B fast . In the symmetric case with β A = β B , m A = m B , T A = T B , we recover equation (3.8a).
Let us now consider the 'slow' cycling case. This case is more complicated, because both R A and R B dip low enough such that m A /R A and m B /R B can no longer be ignored but also simpler, because the system returns to equilibrium with each cycle before the new drug regime can begin. In general, X spikes for a brief period following the introduction of each new antibiotic, and then quickly returns to some equilibrium level, X, which may depend on the antibiotic being used (denoted where C A represents the area beneath the spike and above the equilibrium when switching from drug B to drug A (figure 10). A similar integral applies when administering drug B. In the symmetric case, we refer to equilibrium values, X, and spike integral, C, without superscripts.
We begin by determining the relevant equilibrium in each part of the cycle. Let The equilibrium level of X while drug A is applied is X A ¼ ðm þ gÞ=b A , and similarly X B ¼ ðm þ gÞ=b B . These two equilibria determine the equilibria of the remaining populations, namely Note here that we use notation in a different way to the 7-box model previously discussed in the main text. While superscripts once again represent the antibiotic currently being applied, it is assumed that a royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 switch in the hospital's drug protocol will shift the entire population (not only freshly incoming patients, as in the 7-box model).
In order to calculate X over the time period [0, T A + T B ], we break the integral into two parts, and make use of the _ R equation for the resistant strain in each part, hence dividing through by R A and rearranging to make X the subject, Each of the terms on the right-hand side of the above can be approximated.
Directly after switching drugs, we have _ R A ¼ tR A ; all remaining terms sum to zero, as otherwise the pre-treatment equilibria would not be an equilibria. Hence, immediately following drug switching m A =R A % ðm A = R B A Þ e Àtt . See figure 11 for a comparison of this approximation to numeric solutions of the full ODE system. Straightforward integration gives Ð TA 0 ðm A =R A Þ dt % m A = tR B A . Taken together, these elements give And similarly, it can be shown B.4. Equilibrium post multi-resistance, and construction of X T Ã Suppose we wish to determine X AB , the long-term average number of uninfected individuals, in the presence of multiresistant bacteria. When using cycling therapy, we average X AB over a complete X t Ú X -X dt = C Ú X dt Figure 10. Schematic depiction of Ð X dt over one antibiotic cycle. The integral can be broken into two components; one that accounts for the equilibrium X value, and the other that accounts for the excess X levels immediately after switching.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 cycle. If using mixing or combination protocols, X AB is instead equal to the long-term steady-state value of X(t). We would like to determine this value under the assumption that R AB > 0. While it is possible to find steady states in the case of mixing and combination therapy, solving equations (2.1) analytically for cycling protocols is not (generically) possible. Fortunately, it also proves unnecessary.
Consider equation (2.1d) Assuming that multiresistant bacteria are rare in the community (m AB ≪ 1) and mutation events producing R AB strains are rare, it is possible to divide through by R AB and integrate. This is the approach used by Bonhoeffer et al. [32] in their original paper, although here we will emphasize and make use of the results in a somewhat different manner Both m AB and R AB are non-negative, hence Ð m AB =R AB dt ! 0. ½logðR AB Þ 0 regardless of AB protocol. Taken together, these facts imply The existence of this equilibrium was implied in [32], though never explicitly stated. Strategies that would receive higher X in the absence of multiresistant infections will instead approach an equilibrium or cycle with mean value close to X AB . Protocols that would otherwise result in X , X AB (such as withholding antibiotics entirely) will retain their low X value, and will result in the R AB population decaying at a rate proportional to X AB À X. (d ) the R A trajectory in logarithmic coordinates. As can be seen in both (a) and (d ), in the time window (25,50], when drug B is administered, the decay trajectory of R A towards m/τ is rather complicated; by contrast, when drug A is administered m/R A follows an approximately exponential curve of the form τ e −τt before converging toR A (see the smooth exponential curve in (b) and linear increase in (d )).
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 With equation (B 26) in mind, it is possible (through some abuse of notation) to 'integrate' X from t = 0 to ∞ While infinite, the term Ð 1 0 X AB dt is entirely independent of our antibiotic management regime, and hence can be ignored from the perspective of optimization (its derivatives are zero). Assuming a hospital does not pursue an elimination strategy for multiresistance, R AB will eventually approach R AB % s À X AB , hence the log term equation (B29) can also be considered constant (or approximately so). Finally, m AB /R AB is approximately zero whenever m AB ≪ R AB . Hence, optimization over all time will closely resemble optimization over the far more reasonable timespan ½0, T e .
It is worth discussing the similarities between X T Ã and the optimization criteria G 1/2 , as originally proposed by Bonhoeffer et al. [32]. G 1/2 is defined as 'the gain in uninfecteds before 50% of infecteds are AB-resistant', where 'gain' is defined as the gain relative to the no antibiotic use equilibrium. Stated in the notation of the current work, this would be written as where here X ; represents the 'no antibiotic treatment' equilibrium. This constant term À X ; is frequently dropped in works following up from Bonhoeffer et al.-an approach which is valid when considering integrals with a fixed endpoint such as X 365 , for which Ð X ; dt is itself constant, but significantly alters the optimization criteria for any integral with variable end point, such as G 1/2 and X T (which are 'identical', but for the offset term).
X T Ã differs from G 1/2 in two major ways. First, it uses a different 'offset' term: this change, while numerically minor, is also fairly critical for the sake of physically meaningful results. Whenever β AB < β S there will exist AB management strategies satisfying ðg þ mÞ=b S , X , ðg þ mÞ=b AB such that T 1/2 = −∞, leading to G 1/2 = ∞. Any protocol designed to achieve this intermediate X will be strictly worse than dealing with multiresistant infections directly, and yet this is precisely the strategy that G 1/2 will optimize towards. Note that no such degenerate results occur for offset terms slightly higher than X AB . The second difference between G 1/2 and X T Ã is the choice of integral endpoint: T 1/2 in the former case and T e in the later. This is done for two reasons: firstly, T e is easier to deal with analytically; it depends only on mutation rate, unlike T 1/2 which depends on both mutation and spread of R AB . Secondly, T e  Figure 12. Optimal cycle times. Coloured lines give optimal cycling protocols for each mutant arrival channel, with T = 0 equivalent to a 50 : 50 mixing protocol. The horizontal black line represents t sat , while the dashed vertical line represents β A = β B ; β AB values above this line represent multiresistant strains which are more infectious than their single resistant counterparts. As can be seen, optimal cycling strategies are broadly stable over a range of β AB values, with sudden changes in optimal strategy in the vicinity of various 'tipping points', for example β AB ≈ 0.95 when using M base in the 5-box system.  Figure 13. Optimal cycle times for a given importation rate. Coloured lines give optimal cycling protocols for each mutant arrival channel, with T = 0 equivalent to a 50:50 mixing protocol. The black line represents t sat . As resistance importation rate increases, optimal cycling time decreases. royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 Figure 12 demonstrates how optimal cycle times vary depending on β AB ; optimal cycle times are relatively stable across a range of β AB values, but notably have sudden jumps, especially near β AB → β A = β B . Fortunately, these jumps correspond to regions of parameter space where many T values give almost optimal results. While picking truly optimal T becomes more difficult, selecting almost optimal T is relatively easy. Another important and less easily determined variable is the ABR importation rates m A and m B . Throughout this paper, these rates are assumed to be 'low', but how small, and how sensitive optimal switching times are to these parameters is a question of some interest, especially given our previous hypothesis that optimal arrival times scale with t sat , a function of m A , m B .
In order to investigate this, we run numerical simulations and calculate ð X À X AB Þ= M for a variety of m A = m B and T values, and then select the optimal T for each m ( figure 13). For each m, we also calculate t sat . Optimal T for both M base and M HGT are roughly parallel to t sat over several orders of magnitude, with T HGT ≈ t sat + 30, and T base ≈ t sat − 10. Optimal T for M select is found between these two values, though where it falls in this range varies depending on m. It seems likely that the offset between t sat and optimal T will vary depending on other system parameters. We observe remarkable consistency between both 5-and 7-box models; it would appear that antibiotic 'switching lag' in the 7-box model has minimal impact on optimal cycling time. This is likely the result of 'lag' being significantly smaller than switching time T, and thus not playing into the overall dynamics, so long as T is not too close to 0.

Appendix D. Further comparisons of optimization criteria
In the main text ( §4), we compared four optimization criteria in the context of both mixing and cycling. In each case, we made simplifying assumptions-namely, in the mixing case, we assumed zero treatment switching, q = 0, and in the cycling case, we assumed symmetric cycling and symmetric parameter values. Here, we (briefly) consider the implications of relaxing these assumptions. In figure 14, we consider mixing, with a drug correction rate q = 1/6. As might be expected, the inclusion of drug switching improves outcomes as measured according to X 365 , X T Ã, but reduces T 1/2 and X T (with the (c) ( d) Figure 16. Here, we use the 7-box model to compare '1 year' optimization (X 365 ¼ Ð 365 0 X dt) with 'exponentially discounted preferences' (X l ¼ Ð 1 0 XðtÞ e Àlt dt) for both high and low mutation rates. We compare under the assumption that drug correction rate q = 0, and with λ = 2/365, such that both X 365 and X l integrals have the same 'centre of mass'. We consider both higher and lower mutation rate. Each line indicates a different channel of multiresistance arrival. We observe that X l is qualitatively similar to X 365 , all be it with a angled 'peak' in the middle rather than a flat 'plateau'.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 220181 exception of X T under the influence of M HGT ). Once again, we observe that for X 365 , X T Ã, a mixing ratio close to 50:50 is recommended in all cases except M HGT ; in this case, the optimal χ A ratio remains only slightly offset for X 365 , the maximum value for X T Ã moves from χ A ≈ 0.25 to χ A ≈ 0, that is to say (for the parameter values considered here), horizontal gene transfer is best minimized by having a single 'front line' treatment, and only switching the patient to the second treatment if the first does not work. This contrasts with the case of 'no switching', where both antibiotics needed to be applied with non-trivial frequency in order to get optimal outcomes.
In figure 15, we also consider a single case of 'asymmetric cycling'. In this case, we assume that β A = β B , m A = m B and T A ¼ 3 2 T B . In general, it is found the asymmetry reduces X 365 and X T Ã, while increasing T 1/2 and X T . The one exception being the case of multiresistance arrival via horizontal gene transfer; in this case, fast asymmetric cycling significantly improves X 365 and X T Ã compared to the symmetric case.
One optimization criteria not discussed in detail within the main text is the idea of some 'exponentially discounted preferences': X l ¼ Ð 1 0 XðtÞe Àlt dt, (as used in, for example [54]). Exponentially discounting is common in the literature of economics, and has many nice properties, including analytic tractability, and avoiding 'sharp' cut offs observed for X 365 ¼ Ð 365 0 XðtÞ dt, which gives full weight to patients on Day 364, and cares nothing for patients on Day 366.
Preliminary exploration suggests that X l behaves in many ways like X 365 ; see figure 16. This makes sense, X l can be approximated by a superposition of X n , for many different values of n. It seems possible (but not likely) that the two metrics behave differently when using antibiotic cycling strategies. For the time being, such results remain analytically inaccessible and computationally expensive.